nlp_architect.models.gnmt package

Submodules

nlp_architect.models.gnmt.attention_model module

Attention-based sequence-to-sequence model with dynamic RNN support.

class nlp_architect.models.gnmt.attention_model.AttentionModel(hparams, mode, iterator, source_vocab_table, target_vocab_table, reverse_target_vocab_table=None, scope=None, extra_args=None)[source]

Bases: nlp_architect.models.gnmt.model.Model

Sequence-to-sequence dynamic model with attention.

This class implements a multi-layer recurrent neural network as the encoder and an attention-based decoder. It is the model described in Luong et al. (EMNLP 2015): https://arxiv.org/pdf/1508.04025v5.pdf. In addition to LSTM cells, the class also supports GRU cells, with optional dropout.
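The attention mechanism follows Luong's formulation: at each decoder step, the current decoder state is scored against every encoder state, the scores are softmax-normalized into alignment weights, and the weighted sum of encoder states becomes the context vector. A minimal NumPy sketch of the dot-product ("luong") score variant, independent of this class's actual TensorFlow graph:

    import numpy as np

    def luong_attention(decoder_state, encoder_states):
        """Dot-product Luong attention for a single decoder step.

        decoder_state: [num_units] current decoder hidden state h_t.
        encoder_states: [src_len, num_units] encoder hidden states h_s.
        Returns (context [num_units], weights [src_len]).
        """
        # score(h_t, h_s) = h_t . h_s  (the "general" variant inserts a W matrix)
        scores = encoder_states @ decoder_state        # [src_len]
        # Softmax over source positions yields the alignment weights.
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        # Context vector: attention-weighted average of the encoder states.
        context = weights @ encoder_states             # [num_units]
        return context, weights

    # Toy example: 4 source positions, 3 hidden units.
    context, weights = luong_attention(np.random.randn(3), np.random.randn(4, 3))
    assert np.isclose(weights.sum(), 1.0)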

nlp_architect.models.gnmt.model module

Basic sequence-to-sequence model with dynamic RNN support.

class nlp_architect.models.gnmt.model.BaseModel(hparams, mode, iterator, source_vocab_table, target_vocab_table, reverse_target_vocab_table=None, scope=None, extra_args=None)[source]

Bases: object

Sequence-to-sequence base class.

build_encoder_states(include_embeddings=False)[source]

Stack encoder states and return tensor [batch, length, layer, size].
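The stacking can be pictured in NumPy: given one [batch, length, size] state tensor per encoder layer, stacking along a new axis 2 produces the [batch, length, layer, size] layout. A hypothetical shape-only sketch (the real method operates on the model's TensorFlow tensors):

    import numpy as np

    batch, length, size, num_layers = 2, 5, 8, 3
    # One [batch, length, size] state tensor per encoder layer.
    layer_states = [np.random.randn(batch, length, size) for _ in range(num_layers)]

    # Stack on a new layer axis to get [batch, length, layer, size].
    stacked = np.stack(layer_states, axis=2)
    assert stacked.shape == (batch, length, num_layers, size)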

build_graph(hparams, scope=None)[source]

Subclass must implement this method.

Creates a sequence-to-sequence model with dynamic RNN decoder API.

Parameters:
  • hparams – Hyperparameter configurations.
  • scope – VariableScope for the created subgraph; defaults to “dynamic_seq2seq”.
Returns:

A tuple of the form (logits, loss, final_context_state, sample_id), where:
  • logits: float32 Tensor of shape [batch_size, num_decoder_symbols].
  • loss: the total loss divided by batch_size.
  • final_context_state: the final state of the decoder RNN.
  • sample_id: sampling indices.

Raises:

ValueError – if encoder_type is neither “mono” nor “bi”, or if attention_option is not one of (luong | scaled_luong | bahdanau | normed_bahdanau).
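To make the returned pieces concrete: logits are unnormalized scores over the target vocabulary, and sample_id is typically the per-step argmax (greedy) or a sampled index into that vocabulary. A small self-contained NumPy illustration of the relationship (shapes are hypothetical):

    import numpy as np

    batch_size, num_decoder_symbols = 4, 10
    # Unnormalized scores over the target vocabulary, one row per batch entry.
    logits = np.random.randn(batch_size, num_decoder_symbols).astype(np.float32)

    # Greedy decoding picks sample_id as the argmax over the vocabulary axis.
    sample_id = logits.argmax(axis=-1)   # [batch_size]
    assert sample_id.shape == (batch_size,)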
decode(sess)[source]

Decode a batch.

Parameters: sess – tensorflow session to use.
Returns:

A tuple consisting of (outputs, infer_summary), where outputs is of size [batch_size, time].
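A hypothetical post-processing sketch for the decoded batch, assuming the outputs are integer word ids and a reverse vocabulary mapping is at hand (depending on setup, the model may instead return already-mapped tokens via its reverse_target_vocab_table); decoding stops at the end-of-sentence marker:

    import numpy as np

    outputs = np.array([[12, 7, 3, 0, 0]])   # [batch_size, time] of word ids
    reverse_vocab = {12: "hello", 7: "world", 3: "</s>", 0: "<pad>"}  # hypothetical

    def ids_to_sentence(ids, eos="</s>"):
        words = []
        for i in ids:
            word = reverse_vocab.get(int(i), "<unk>")
            if word == eos:
                break
            words.append(word)
        return " ".join(words)

    print(ids_to_sentence(outputs[0]))   # -> "hello world"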
eval(sess)[source]

Execute eval graph.

get_max_time(tensor)[source]

infer(sess)[source]

init_embeddings(hparams, scope)[source]

Initialize embeddings.

train(sess)[source]

Execute train graph.

class nlp_architect.models.gnmt.model.Model(hparams, mode, iterator, source_vocab_table, target_vocab_table, reverse_target_vocab_table=None, scope=None, extra_args=None)[source]

Bases: nlp_architect.models.gnmt.model.BaseModel

Sequence-to-sequence dynamic model.

This class implements a multi-layer recurrent neural network as encoder, and a multi-layer recurrent neural network decoder.

nlp_architect.models.gnmt.model_helper module

Utility functions for building models.

nlp_architect.models.gnmt.model_helper.get_initializer(init_op, seed=None, init_weight=None)[source]

Create an initializer. init_weight is used only for the uniform initializer.
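A plausible TF 1.x sketch of the mapping from init_op names to initializers, under the assumption that init_op takes values such as "uniform", "glorot_normal", or "glorot_uniform" (the actual option set comes from the hparams):

    import tensorflow as tf  # assumes TensorFlow 1.x

    def get_initializer_sketch(init_op, seed=None, init_weight=None):
        if init_op == "uniform":
            # init_weight bounds the uniform range; unused by the other options.
            return tf.random_uniform_initializer(-init_weight, init_weight, seed=seed)
        if init_op == "glorot_normal":
            return tf.glorot_normal_initializer(seed=seed)
        if init_op == "glorot_uniform":
            return tf.glorot_uniform_initializer(seed=seed)
        raise ValueError("Unknown init_op %s" % init_op)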

nlp_architect.models.gnmt.model_helper.get_device_str(device_id, num_gpus)[source]

Return a device string for multi-GPU setup.
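Round-robin placement reduces to a modulo over the available GPUs. A minimal sketch of the likely behavior, falling back to the CPU when no GPUs are configured (an assumption; the real helper may differ in details):

    def get_device_str_sketch(device_id, num_gpus):
        # With no GPUs, everything lands on the CPU.
        if num_gpus == 0:
            return "/cpu:0"
        # Otherwise cycle layers across GPUs: 0, 1, ..., num_gpus - 1, 0, ...
        return "/gpu:%d" % (device_id % num_gpus)

    assert get_device_str_sketch(5, 4) == "/gpu:1"
    assert get_device_str_sketch(2, 0) == "/cpu:0"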

nlp_architect.models.gnmt.model_helper.create_train_model(model_creator, hparams, scope=None, num_workers=1, jobid=0, extra_args=None)[source]

Create train graph, model, and iterator.

nlp_architect.models.gnmt.model_helper.create_eval_model(model_creator, hparams, scope=None, extra_args=None)[source]

Create eval graph, model, src/tgt file placeholders, and iterator.

nlp_architect.models.gnmt.model_helper.create_infer_model(model_creator, hparams, scope=None, extra_args=None)[source]

Create inference model.

nlp_architect.models.gnmt.model_helper.create_emb_for_encoder_and_decoder(share_vocab, src_vocab_size, tgt_vocab_size, src_embed_size, tgt_embed_size, embed_type='dense', dtype=tf.float32, num_enc_partitions=0, num_dec_partitions=0, src_vocab_file=None, tgt_vocab_file=None, src_embed_file=None, tgt_embed_file=None, use_char_encode=False, scope=None)[source]

Create embedding matrix for both encoder and decoder.

Parameters:
  • share_vocab – A boolean. Whether to share embedding matrix for both encoder and decoder.
  • src_vocab_size – An integer. The source vocab size.
  • tgt_vocab_size – An integer. The target vocab size.
  • src_embed_size – An integer. The embedding dimension for the encoder’s embedding.
  • tgt_embed_size – An integer. The embedding dimension for the decoder’s embedding.
  • dtype – dtype of the embedding matrix. Defaults to tf.float32.
  • num_enc_partitions – number of partitions used for the encoder’s embedding variables.
  • num_dec_partitions – number of partitions used for the decoder’s embedding variables.
  • scope – VariableScope for the created subgraph. Defaults to “embedding”.
Returns:

A tuple (embedding_encoder, embedding_decoder), where:
  • embedding_encoder: Encoder’s embedding matrix.
  • embedding_decoder: Decoder’s embedding matrix.

Raises:

ValueError – if share_vocab is set but the source and target vocab sizes differ.
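A condensed TF 1.x sketch of the core branch, under simplifying assumptions (dense embed_type, no partitioning, no pretrained embedding or vocab files):

    import tensorflow as tf  # assumes TensorFlow 1.x

    def create_embeddings_sketch(share_vocab, src_vocab_size, tgt_vocab_size,
                                 src_embed_size, tgt_embed_size, dtype=tf.float32):
        with tf.variable_scope("embedding", dtype=dtype):
            if share_vocab:
                if src_vocab_size != tgt_vocab_size:
                    raise ValueError("Shared embedding requires equal vocab sizes:"
                                     " %d vs. %d" % (src_vocab_size, tgt_vocab_size))
                # One matrix serves both encoder and decoder.
                embedding = tf.get_variable(
                    "embedding_share", [src_vocab_size, src_embed_size], dtype)
                return embedding, embedding
            embedding_encoder = tf.get_variable(
                "embedding_encoder", [src_vocab_size, src_embed_size], dtype)
            embedding_decoder = tf.get_variable(
                "embedding_decoder", [tgt_vocab_size, tgt_embed_size], dtype)
            return embedding_encoder, embedding_decoder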

nlp_architect.models.gnmt.model_helper.create_rnn_cell(unit_type, num_units, num_layers, num_residual_layers, forget_bias, dropout, mode, num_gpus, base_gpu=0, single_cell_fn=None)[source]

Create multi-layer RNN cell.

Parameters:
  • unit_type – string representing the unit type, e.g. “lstm”.
  • num_units – the depth of each unit.
  • num_layers – number of cells.
  • num_residual_layers – Number of residual layers from top to bottom. For example, if num_layers=4 and num_residual_layers=2, the last 2 RNN cells in the returned list will be wrapped with ResidualWrapper.
  • forget_bias – the initial forget bias of the RNNCell(s).
  • dropout – floating-point value between 0.0 and 1.0: the probability of dropout. This is ignored if mode != TRAIN.
  • mode – either tf.contrib.learn.TRAIN/EVAL/INFER
  • num_gpus – The number of gpus to use when performing round-robin placement of layers.
  • base_gpu – The gpu device id to use for the first RNN cell in the returned list. The i-th RNN cell will use (base_gpu + i) % num_gpus as its device id.
  • single_cell_fn – allows supplying a customized cell constructor. When not specified, defaults to model_helper._single_cell.
Returns:

An RNNCell instance.
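A simplified sketch of how such a cell stack is typically assembled with TF 1.x contrib wrappers: dropout only in training, residual connections on the top num_residual_layers cells, and round-robin device placement. The real helper delegates per-cell construction to single_cell_fn; this is only an illustration of the pattern:

    import tensorflow as tf  # assumes TensorFlow 1.x (tf.contrib)

    def create_rnn_cell_sketch(num_units, num_layers, num_residual_layers,
                               dropout, is_training, num_gpus, base_gpu=0):
        cells = []
        for i in range(num_layers):
            cell = tf.contrib.rnn.BasicLSTMCell(num_units)
            if is_training and dropout > 0.0:
                cell = tf.contrib.rnn.DropoutWrapper(
                    cell, input_keep_prob=1.0 - dropout)
            # Wrap the last num_residual_layers cells with ResidualWrapper.
            if i >= num_layers - num_residual_layers:
                cell = tf.contrib.rnn.ResidualWrapper(cell)
            # Round-robin placement: the i-th cell goes to (base_gpu + i) % num_gpus.
            if num_gpus > 0:
                cell = tf.contrib.rnn.DeviceWrapper(
                    cell, "/gpu:%d" % ((base_gpu + i) % num_gpus))
            cells.append(cell)
        return cells[0] if len(cells) == 1 else tf.contrib.rnn.MultiRNNCell(cells)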

nlp_architect.models.gnmt.model_helper.gradient_clip(gradients, max_gradient_norm)[source]

Clip the gradients of a model.
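Clipping in seq2seq training is conventionally done by global norm, rescaling all gradients together so their joint norm does not exceed the threshold. A minimal TF 1.x sketch of that convention (the pre-clip norm is worth returning for summaries):

    import tensorflow as tf  # assumes TensorFlow 1.x

    def gradient_clip_sketch(gradients, max_gradient_norm):
        # Rescale all gradients jointly so their global norm is at most
        # max_gradient_norm; gradient_norm is the norm before clipping.
        clipped_gradients, gradient_norm = tf.clip_by_global_norm(
            gradients, max_gradient_norm)
        return clipped_gradients, gradient_norm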

nlp_architect.models.gnmt.model_helper.create_or_load_model(model, model_dir, session, name)[source]

Create translation model and initialize or load parameters in session.

nlp_architect.models.gnmt.model_helper.load_model(model, ckpt_path, session, name)[source]

Load model from a checkpoint.

nlp_architect.models.gnmt.model_helper.avg_checkpoints(model_dir, num_last_checkpoints, global_step, global_step_name)[source]

Average the last N checkpoints in the model_dir.
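Checkpoint averaging is the element-wise mean of each variable across the last N checkpoints; averaged weights often generalize slightly better than the final snapshot. A NumPy sketch of just the arithmetic (the real helper reads and rewrites TensorFlow checkpoints):

    import numpy as np

    # Hypothetical: the same 4x4 variable read from the last 3 checkpoints.
    var_snapshots = [np.random.randn(4, 4) for _ in range(3)]

    averaged = np.mean(var_snapshots, axis=0)   # element-wise mean
    assert averaged.shape == (4, 4)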

nlp_architect.models.gnmt.model_helper.compute_perplexity(model, sess, name)[source]

Compute perplexity of the output of the model.

Parameters:
  • model – model for which to compute perplexity.
  • sess – tensorflow session to use.
  • name – name of the batch.
Returns:

The perplexity of the eval outputs.
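Perplexity is the exponential of the average per-word cross-entropy. A sketch of the accumulation, assuming each eval step yields a summed batch loss and a predicted-word count:

    import math

    def compute_perplexity_sketch(batch_losses_and_counts):
        """batch_losses_and_counts: iterable of (total_batch_loss, word_count)."""
        total_loss, total_count = 0.0, 0
        for loss, count in batch_losses_and_counts:
            total_loss += loss
            total_count += count
        # ppl = exp(average cross-entropy per predicted word)
        return math.exp(total_loss / total_count)

    print(compute_perplexity_sketch([(230.2, 100), (460.5, 200)]))  # ~10.0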

Module contents